
KIISE Transactions on Computing Practices (정보과학회 컴퓨팅의 실제 논문지)


Korean Title: 그래프 기반 준지도 학습을 이용한 속성값 전파 결측치 추정
English Title: Missing Value Imputation with Attribute Value Propagation using Graph-based Semi-Supervised Learning
Authors: Yukyung Shin (신유경), Hyunjung Shin (신현정)
Citation: Vol. 25, No. 10, pp. 511-516 (October 2019)
Korean Abstract (English translation)
It is very common for records in a dataset to lack one or more attribute values. In many cases, the proportion of complete records, those with no missing values, is small relative to the size of the data. The most common remedy is statistical: replace the missing value with the mean, the mode, or the median. In machine learning, imputation methods based on k-nearest neighbor search or decision trees are also frequently used. The former is a global approach that substitutes a representative value of each attribute, whereas the latter is a local approach that substitutes attribute values taken from records similar to the record in question. However, when most of the values of an attribute are missing, neither approach is easy to apply. To overcome this limitation, this study proposes a method that estimates a value from neighboring attributes that are highly correlated with the attribute of the missing value. Because it is based on correlations between attributes, it remains usable even when most of the values of an attribute are missing. The proposed methodology builds a correlation graph from the correlation coefficients between attributes and applies graph-based semi-supervised learning to it. A missing value is estimated by propagation from the other attribute values in proportion to the correlation coefficients. Experiments compare the proposed imputation method with the statistical and machine learning methods commonly used for missing value imputation.
English Abstract
Data records that lack one or more attribute values are very common, and in many cases few complete records, free of missing values, are available. Statistical methods that replace missing values with the mean, mode, or median are commonly used. Machine learning algorithms such as k-nearest neighbors or decision trees are also used to estimate missing values. The statistical approach is a global method that replaces a missing value with a representative value of its attribute, whereas the machine learning approach is a local method that replaces it with attribute values from similar records. However, it is difficult to use either method when almost all values of an attribute are missing. To overcome this limitation, we propose a method that estimates missing values from neighboring attributes that are strongly correlated with the missing attribute. Because it is based on the correlation between attributes, it can be used even if almost all values of an attribute are missing. In the proposed method, a correlation graph is constructed from the correlation coefficients between attributes, and graph-based semi-supervised learning is applied to it. Missing values are estimated by propagation from the related attributes in proportion to the correlation coefficients. The proposed method is compared experimentally with the statistical methods and machine learning algorithms generally used for missing value imputation.
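The abstract does not give the exact formulation, so the following is only a minimal sketch of the general idea, under several assumptions that are not taken from the paper: attributes are graph nodes, edge weights are pairwise-complete Pearson correlations, values are standardized per attribute, and each missing value is obtained by iterating a correlation-weighted average of its neighbors (a label-propagation style update with signed weights). The function names `pearson` and `impute_by_propagation` and the parameter `n_iter` are hypothetical.

```python
import math

def pearson(xs, ys):
    """Pearson correlation over the pairs where both values are observed."""
    pairs = [(x, y) for x, y in zip(xs, ys) if x is not None and y is not None]
    if len(pairs) < 3:  # too few shared observations: treat as uncorrelated
        return 0.0
    n = len(pairs)
    mx = sum(x for x, _ in pairs) / n
    my = sum(y for _, y in pairs) / n
    sxy = sum((x - mx) * (y - my) for x, y in pairs)
    sxx = sum((x - mx) ** 2 for x, _ in pairs)
    syy = sum((y - my) ** 2 for _, y in pairs)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / math.sqrt(sxx * syy)

def impute_by_propagation(records, n_iter=100):
    """Fill None entries by propagating values over an attribute correlation graph.

    Assumption (not from the paper): observed attributes of a record act as
    labeled nodes, missing ones as unlabeled nodes.
    """
    n_attr = len(records[0])
    cols = [[r[j] for r in records] for j in range(n_attr)]

    # Mean and standard deviation of each attribute, from observed values only.
    stats = []
    for col in cols:
        obs = [v for v in col if v is not None]
        if not obs:
            stats.append((0.0, 1.0))
            continue
        mean = sum(obs) / len(obs)
        std = math.sqrt(sum((v - mean) ** 2 for v in obs) / len(obs)) or 1.0
        stats.append((mean, std))

    # Signed edge weights between attributes; no self-loops.
    w = [[0.0 if j == k else pearson(cols[j], cols[k]) for k in range(n_attr)]
         for j in range(n_attr)]

    filled = []
    for r in records:
        # Standardize observed values; missing nodes start at 0 (the mean).
        z = [0.0 if v is None else (v - stats[j][0]) / stats[j][1]
             for j, v in enumerate(r)]
        missing = [j for j, v in enumerate(r) if v is None]
        for _ in range(n_iter):
            for j in missing:
                denom = sum(abs(w[j][k]) for k in range(n_attr))
                if denom > 0:
                    z[j] = sum(w[j][k] * z[k] for k in range(n_attr)) / denom
        # Map propagated scores back to the original scale; keep observed values.
        filled.append([v if v is not None else stats[j][0] + stats[j][1] * z[j]
                       for j, v in enumerate(r)])
    return filled

if __name__ == "__main__":
    data = [[float(a), 2.0 * a, 10.0 - a] for a in range(1, 9)]
    data[3][1] = None  # hide one value of the second attribute
    print(impute_by_propagation(data)[3])
```

In this sketch an attribute with no correlated neighbors falls back to its observed mean, because its standardized score never moves from zero. Using signed weights with an absolute-value normalizer is one way to let negatively correlated attributes contribute; other choices are equally plausible.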
Keywords: semi-supervised learning, graph theory, machine learning, missing value imputation